A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech
Author
Abstract
This paper presents a time-scale and pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency-domain source-filter model in which the source is modeled as a mixed excitation. The model is tightly coupled with a compression scheme that results in compact acoustic inventories. Compared to the approach in the Whistler system, which uses no mixed excitation, the new method improves voiced fricatives and over-stretched voiced sounds. In addition, it allows spectral manipulations such as smoothing of discontinuities at unit boundaries, voice transformations, or loudness equalization.
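The abstract gives no implementation details, so the following is only a minimal sketch of what a frequency-domain mixed-excitation source-filter frame could look like, not the paper's method. It assumes a single voicing cutoff frequency (harmonics of the pitch below it, random-phase noise above it) and a caller-supplied spectral envelope. The function names, the `cutoff_hz` parameter, and the flat test envelope are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def mixed_excitation_spectrum(f0, cutoff_hz, sample_rate, n_fft, seed=0):
    """Return a one-sided excitation spectrum with n_fft // 2 + 1 complex bins.

    Assumption: one voicing cutoff splits the band. Below cutoff_hz the source
    is a zero-phase harmonic comb at multiples of f0; at or above cutoff_hz it
    is unit-magnitude noise with random phase.
    """
    rng = random.Random(seed)
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft
    spectrum = [0j] * n_bins
    # Voiced band: one spectral line at the bin nearest each harmonic.
    k = 1
    while k * f0 < cutoff_hz:
        spectrum[round(k * f0 / bin_hz)] = 1 + 0j
        k += 1
    # Unvoiced band: random-phase bins from the cutoff up to Nyquist.
    for b in range(math.ceil(cutoff_hz / bin_hz), n_bins):
        spectrum[b] = cmath.rect(1.0, rng.uniform(-math.pi, math.pi))
    return spectrum


def inverse_real_dft(spectrum, n_fft):
    """Inverse DFT of a one-sided spectrum, returning n_fft real samples."""
    frame = []
    for n in range(n_fft):
        acc = spectrum[0].real
        for k in range(1, len(spectrum)):
            # Interior bins count twice (conjugate pair); Nyquist counts once.
            weight = 1.0 if k == n_fft // 2 else 2.0
            phasor = cmath.exp(2j * math.pi * k * n / n_fft)
            acc += weight * (spectrum[k] * phasor).real
        frame.append(acc / n_fft)
    return frame


def synthesize_frame(f0, cutoff_hz, envelope, sample_rate=16000, n_fft=256, seed=0):
    """Shape the mixed excitation with envelope(hz) and return one time frame."""
    excitation = mixed_excitation_spectrum(f0, cutoff_hz, sample_rate, n_fft, seed)
    bin_hz = sample_rate / n_fft
    shaped = [e * envelope(b * bin_hz) for b, e in enumerate(excitation)]
    return inverse_real_dft(shaped, n_fft)
```

In this sketch, pitch-scale modification amounts to calling `synthesize_frame` with a different `f0` while passing the same `envelope`, so the harmonics move but the filter shape stays fixed. For example, `synthesize_frame(200.0, 4000.0, lambda hz: 1.0)` and `synthesize_frame(250.0, 4000.0, lambda hz: 1.0)` share the envelope and differ only in harmonic spacing.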
Similar resources
Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis
To preserve shape-invariance when pitch or time-scale modifying sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated excitation points. Previous methods achieve this by estimating excitation phases at synthesis frame boundaries, disregarding the frequency modulation that may occur between the frame boundary...
VLSI implementation of a TSM/FSM algorithm
The time scale modification (TSM) of speech is concerned with compressing or expanding audio signals in the time domain without affecting the signal's pitch or naturalness. Conversely, the frequency scale modification (FSM) of speech is concerned with altering the pitch and formants of a signal without changing the signal duration. This paper describes a hardware implemented and optimized...
Shape-invariant pitch and time-scale modification of speech by variable order phase interpolation
To preserve the waveform shape and perceived quality of pitch and time-scale modified sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated pitch pulse locations. The glottal excitation is therefore made to resemble a pseudoperiodic impulse train, a quality essential for shape-invariance. Conventional method...
Non-parametric techniques for pitch-scale and time-scale modification of speech
Time-scale and, to a lesser extent, pitch-scale modifications of speech and audio signals are the subject of major theoretical and practical interest. Applications are numerous, including, to name but a few, text-to-speech synthesis (based on acoustical unit concatenation), transformation of voice characteristics, foreign language learning but also audio monitoring or film/soundtrack post-synch...
Voice source analysis for pitch-scale modification of speech signals
Much research has shown that the voice source has a strong influence on the quality of speech processing [4][5][6]. But in most of the existing speech modification algorithms, the effect of the voice source variation is neglected. This work explains why the existing modification scheme can’t truly reflect the voice source variation during pitch modification. We use synthesized voiced speech sound...
Publication date: 1998